Audio head pose estimation using the direct to reverberant speech ratio
Similar resources
PSD estimation in Beamspace for Estimating Direct-to-Reverberant Ratio from A Reverberant Speech Signal
A method for estimation of direct-to-reverberant ratio (DRR) using a microphone array is proposed. The proposed method estimates the power spectral density (PSD) of the direct sound and the reverberation using the algorithm PSD estimation in beamspace with a microphone array and calculates the DRR of the observed signal. The speech corpus of the ACE (Acoustic Characterisation of Environments) C...
Direct-to-Reverberant Ratio Estimation on the ACE Corpus Using a Two-channel Beamformer
Direct-to-Reverberant Ratio (DRR) is an important measure for characterizing the properties of a room. The recently proposed DRR Estimation using a Null-Steered Beamformer (DENBE) algorithm was originally tested on simulated data where noise was artificially added to the speech after convolution with impulse responses simulated using the image-source method. This paper evaluates the performance...
Estimation of the direct-to-reverberant Energy Ratio using a spherical microphone array
This paper proposes a practical approach to estimate the direct-to-reverberant energy ratio (DRR) using a spherical microphone array without having knowledge of the source signal. We base our estimation on a theoretical relationship between the DRR and the coherence estimation function between coincident pressure and particle velocity. We discuss the proposed method’s ability to estimate the DRR...
Sampling techniques for audio-visual tracking and head pose estimation
Analyzing people's behavior in smart environments using multimodal sensors requires answering a set of typical questions: who the people are, where they are, what activities they are doing, when, with whom they are interacting, and how. In this view, locating people or their faces and characterizing them (e.g. extracting their body or head orientation) allows one to address the first two questions (w...
Ideal Ratio Mask Estimation Using Deep Neural Networks for Monaural Speech Segregation in Noisy Reverberant Conditions
Monaural speech segregation is an important problem in robust speech processing and has been formulated as a supervised learning problem. In supervised learning methods, the ideal binary mask (IBM) is usually used as the target because of its simplicity and large speech intelligibility gains. Recently, the ideal ratio mask (IRM) has been found to improve the speech quality over the IBM. However...
Journal
Journal title: Speech Communication
Year: 2016
ISSN: 0167-6393
DOI: 10.1016/j.specom.2016.09.005